PDNet: Semantic Segmentation integrated with a Primal-Dual Network for Document binarization
نویسندگان
چکیده
Binarization of digital documents is the task of classifying each pixel in an image of the document as belonging to the background (parchment/paper) or foreground (text/ink). Historical documents are often subject to degradations, that make the task challenging. In the current work a deep neural network architecture is proposed that combines a fully convolutional network with an unrolled primal-dual network that can be trained end-to-end to achieve state of the art binarization on four out of seven datasets. Document binarization is formulated as an energy minimization problem. A fully convolutional neural network is trained for semantic labeling of pixels to provide class labeling cost associated with each pixel. This cost estimate is refined along the edges to compensate for any over or under estimation of the foreground class using a primal-dual approach. We provide necessary overview on proximal operator that facilitates theoretical underpinning required to train a primal-dual network using a gradient descent algorithm. Numerical instabilities encountered due to the recurrent nature of primal-dual approach are handled. We provide experimental results on document binarization competition dataset along with network changes and hyperparameter tuning required for stability and performance of the network. The network when pre-trained on synthetic dataset performs better as per the competition metrics.
منابع مشابه
Document Analysis And Classification Based On Passing Window
In this paper we present Document analysis and classification system to segment and classify contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction and document classification. A document image is enhanced in the preprocessing by removing noise, binarization, and detecting and correcting image skew. In document segmentation, an algorith...
متن کاملSemantic Labeling using Convolutional Networks coupled with Graph-Cuts for Document binarization
Most data mining applications on collections of historical documents require binarization of the digitized images as a pre-processing step. Historical documents are often subjected to degradations making mathematical modeling of appearance of the text, background and all kinds of degradations challenging. In the current work we try to tackle binarization as pixel classification problem. We firs...
متن کاملHistorical Document Binarization Combining Semantic Labeling and Graph Cuts
Most data mining applications on collections of historical documents require binarization of the digitized images as a pre-processing step. Historical documents are often subjected to degradations such as parchment aging, smudges and bleed through from the other side. The text is sometimes printed, but more often handwritten. Mathematical modeling of the appearance of the text, as well as the b...
متن کاملDocument Binarization Combining with Graph Cuts and Deep Neural Networks
Most data mining applications on collections of historical documents require binarization of the digitized images as a pre-processing step. Historical documents are often subjected to degradations such as parchment aging, smudges and bleed through from the other side. The text is sometimes printed, but more often handwritten. Mathematical modeling of the appearance of the text, as well as the b...
متن کاملAutomatic road crack detection and classification using image processing techniques, machine learning and integrated models in urban areas: A novel image binarization technique
The quality of the road pavement has always been one of the major concerns for governments around the world. Cracks in the asphalt are one of the most common road tensions that generally threaten the safety of roads and highways. In recent years, automated inspection methods such as image and video processing have been considered due to the high cost and error of manual metho...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1801.08694 شماره
صفحات -
تاریخ انتشار 2018